-
Notifications
You must be signed in to change notification settings - Fork 29.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
http: optimize by corking outgoing requests #7946
Conversation
CI failed with some possibly related failures:
|
Makes you wonder if this block could benefit from being changed to multiple sends like we do with Buffers: chunk = len.toString(16) + CRLF + chunk + CRLF;
ret = this._send(chunk, encoding, callback); |
I do think those failures are related, so I'm going to do some looking into those. @ronkorving good idea — the string concatenation could be a problem. I might try that, thanks! I'm having slight second thoughts about my approach to this. I think the idea of corking and uncorking on the next tick (like #2020) should be in streams rather than HTTP so I'm going to put together an alternative PR with that in mind. This sort of stuff should also be available to TLS and such. |
@nodejs/http @nodejs/streams |
Maybe this might benefit from some string concatenation optimizations, e.g. https://github.com/davidmarkclements/flatstr. cc @davidmarkclements. @brendanashworth #2020 is unrelated to this case. I agree that |
lib/_http_outgoing.js
Outdated
if (this.socket) { | ||
// If the socket is corked from a write, uncork it before destroying it. | ||
if (this.socket._writableState.corked) | ||
this.socket.uncork(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not correct. Multiple cork()
call increase the corked
counter: https://github.com/nodejs/node/blob/master/lib/_stream_writable.js#L226-L230.
This needs to be a for loop that calls uncork()
until corked
is 0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK that would be a backwards-incompatible change. It would change behavior in something like this:
(req, res) => {
res.cork();
res.write('should not write');
res.destroy();
}
Because Socket#destroy()
stops all IO on the socket, it isn't supposed to flush the last message. With this commit, writing to the stream will only cork it once, and thus we only uncork once, to leave the dev's previous cork-intent there. Looping until we uncork fully would write the message which I don't believe happens right now (edge case, yes 😛 )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm lost. Why are you uncorking on destroy then?
Probably we shouldn't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm uncorking on destroy because, without it, it broke a test — leaving it corked would be backwards-incompatible as well. Unfortunately this is part of a gray area. There isn't any guarantee that the message would be flushed regardless, as net.js
may not be able to write it synchronously. If the message is buffered to be written asynchronously, and res.destroy()
is called before that happens, the message will not be written at all. It's an odd gray area, but this is the best way to keep code breakage down.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
which test was failing?
IMHO not transmitting the data when destroy() is hit is the correct behavior of destroy(). destroy
is a dangerous operation anyway. @mafintosh what do you think?
Beware that, if the socket was corked twice, we need to call enough uncork() for this to take into effect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO not transmitting the data when destroy() is hit is the correct behavior of destroy().
Yes, that's how it behaves right now. .write()
can however flush the data synchronously if the socket is ready, so this has undefined behavior:
res.write('this may or may not send');
res.destroy();
The data may or may not be flushed. This has been around in node forever.
Beware that, if the socket was corked twice, we need to call enough uncork() for this to take into effect.
That's not what I'm doing here — we add only a single cork, so we uncork only once. If the user wants to cork the socket, write to it and destroy it, we shouldn't flush the message. That would be backwards-incompatible.
(edit): which test was failing?
test/parallel/test-http-abort-client.js
fails without this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO not transmitting the data when destroy() is hit is the correct behavior of destroy().
Yes, that's how it behaves right now. .write() can however flush the data synchronously if the socket is ready, so this has undefined behavior:
res.write('this may or may not send');
res.destroy();
The behavior of destroy()
is something we should define and clarify throughout core.
I'm ok with this particular change. Can you add a reference to the failing unit test in the comment? Otherwise it will be extremely hard to make the connection when looking at the code.
Beware that, if the socket was corked twice, we need to call enough uncork() for this to take into effect.
That's not what I'm doing here — we add only a single cork, so we uncork only once. If the user wants to cork the socket, write to it and destroy it, we shouldn't flush the message. That would be backwards-incompatible.
In this PR, every time write()
is called, this.connection.cork()
is called. It is entirely possible that _writableState.corked
is greater that one.
res.write('a')
res.write('b')
res.destroy() // will not uncork
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The behavior of destroy() is something we should define and clarify throughout core.
Not so much destroy()
, it's just the socket.write()
API is weird. It returns a boolean about whether or not it works like a sync or async function. It doesn't necessarily have to be fixed, it just has to be worked around because test/parallel/test-http-abort-client.js
and userland doesn't always use it perfectly. I'll add a better comment.
It is entirely possible that _writableState.corked is greater that one.
Ah, I think you're right. I'll fix that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's just the socket.write() API is weird. It returns a boolean about whether or not it works like a sync or async function.
This is not true. socket.write()
is consistent from the point of view of the stream. It returns true
if the user can keep writing, and false
otherwise. From an implementation perspective, it will return true
for a while after "operations become asynchronous" (that's a simplification) because of the buffering.
Add a unit test for the "write-write-destroy" use case, similar to test/parallel/test-http-abort-client.js.
Good work! I think it is a good contribution! |
@mcollina thanks for your review! I like your judgement on the streams api, I might open an issue once this PR is 👍 |
I think this should be semver-major. Any other opinion? |
I happily defer to your judgement on the semveriness of it. Marking semver-major to be safe. |
@addaleax we are still waiting for some nits to be fixed. However, I would love to see this lands on v7. @brendanashworth how are you with this? Do you need any help? |
@mcollina doing good! It's not so much the nits that I need to spend time working on, but I need to investigate the test failures. I just haven't had the time to ask rvagg for access to one of the nodes yet — I'll try to get it working within the week. |
@brendanashworth Any updates on the suggestion I made? |
Just checking in! Keen on progress updates. |
@brendanashworth I would love to see this landed. Would you like somebody else to continue your work? Have you have more time to work on this? |
I can facilitate access to any test host if need be. Since CI results are long gone, I rebased (no conflicts) and started a new run: https://ci.nodejs.org/job/node-test-commit/5483/ |
Sorry about the lack of updates — I'll be doing the rest of the work on this over the long weekend 😄 happy to see that there's interest in landing this! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM if CI is green and benchmarks are good!
82dee6b
to
f3bba20
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good work!
LGTM with some nits to be fixed regarding comments.
lib/_http_outgoing.js
Outdated
// This is a shameful hack to get the headers and first body chunk onto | ||
// the same packet. Future versions of Node are going to take care of | ||
// this at a lower level and in a more general way. | ||
// Send the headers before the body. OutgoingMessage#write should cork |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you remove "should" here? They are corked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
got it 👍
lib/_http_outgoing.js
Outdated
} else if (data.length === 0) { | ||
this._flushOutput(connection); | ||
|
||
// Don't bother with an empty message. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you rephrase this? Something like "avoid writing an empty buffer"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the confusion, not LGTM yet. There is the discussion around _writableState.corked
to be finalized.
f3bba20
to
1311f0e
Compare
@mcollina thank you for your very thorough review 😅 please take another look. @ronkorving thanks for your suggestion earlier — I've incorporated it into the latest changes. Regarding the CI and other platforms — there were problems with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry to be picky, but connectionCorkNT
is needed
lib/_http_outgoing.js
Outdated
process.nextTick(() => { | ||
this.connection.uncork(); | ||
this._corkedForWrite = false; | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do not allocate a closure here, it will slow things down considerably. Use connectionCorkNT
as it was done before.
c109972
to
ec42085
Compare
c133999
to
83c7a88
Compare
This commit opts for a simpler way to batch writes to HTTP clients into fewer packets. Instead of the complicated snafu which was before, now OutgoingMessage#write automatically corks the socket and uncorks on the next tick, allowing streams to batch them efficiently. It also makes the code cleaner and removes an ugly-ish hack.
The first change in `_writeRaw`: This reduces drops an unnecessary if check (`outputLength`), because it is redone in `_flushOutput`. It also changes an if/else statement to an if statement, because the blocks were unrelated. The second change in `write`: This consolidates code in #write() that handled different string encodings and Buffers. There was no reason to handle the encodings differently, so after splitting them based on Buffer vs encoding, the code is consolidated. This might see a speedup. Shoutout to Ron Korving <ron@ronkorving.nl> for spotting this.
ec42085
to
6acff8f
Compare
PTAL @mcollina 😄 edit: CI: https://ci.nodejs.org/job/node-test-pull-request/5402/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM if CI is green and the perf results are still confirmed.
I'm seeing regressions with strong significance when both chunks <= 1 and length <= 1024 for strings (mostly the 'bytes' type in the benchmark):
|
@brendanashworth Any chance you could look into those performance regressions? |
A rebase is needed. |
Ping. What do we want to do with this one? |
Checklist
make -j4 test
(UNIX), orvcbuild test nosign
(Windows) passesAffected core subsystem(s)
http
Description of change
This commit optimizes outgoing HTTP responses by corking the socket upon each write and then uncorking on the next tick. By doing this we get to remove a "shameful hack" that is years old while at the same time fixing #7914 and getting some performance improvements.
The second commit is just a cleanup commit that makes nothing do even less nothing.
benchmarks!
CI: https://ci.nodejs.org/job/node-test-pull-request/3498/